Empirical Policy Evaluation With Supergraphs
Abstract
We devise algorithms for the policy evaluation problem in reinforcement learning, assuming access to a simulator and certain side information called the supergraph. Our algorithms explore backward from high-cost states to find high-value ones, in contrast to forward approaches that work forward from all states. While several papers have demonstrated the utility of backward exploration empirically, we conduct rigorous analyses which show that our algorithms can reduce average-case sample complexity from O(S log S) to as low as O(log S). Analytically, we adapt tools from the network science literature to provide a new methodology for reinforcement learning problems.
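The backward-exploration idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It assumes the supergraph is available as a mapping from each state to the states that might transition into it, and it simply runs a breadth-first search backward from the high-cost states to collect the states that could reach them within a fixed number of steps. The function name, the `predecessors` representation, and the `max_depth` parameter are all hypothetical choices made for this illustration.

```python
from collections import deque


def backward_reachable(predecessors, high_cost_states, max_depth):
    """Breadth-first search backward over an assumed supergraph.

    predecessors: dict mapping a state to an iterable of states that may
        transition into it (hypothetical supergraph representation).
    high_cost_states: states where cost is incurred (search roots).
    max_depth: maximum number of backward steps to explore.

    Returns a dict {state: backward distance to the nearest high-cost state}.
    """
    depth = {s: 0 for s in high_cost_states}
    queue = deque(high_cost_states)
    while queue:
        state = queue.popleft()
        if depth[state] == max_depth:
            continue  # do not expand beyond the allowed horizon
        for prev in predecessors.get(state, ()):
            if prev not in depth:
                depth[prev] = depth[state] + 1
                queue.append(prev)
    return depth


# Hypothetical 5-state chain 0 -> 1 -> 2 -> 3 -> 4; only state 4 incurs cost.
predecessors = {1: [0], 2: [1], 3: [2], 4: [3]}
found = backward_reachable(predecessors, high_cost_states=[4], max_depth=2)
```

In this toy chain the search touches only states 4, 3, and 2. States 0 and 1 are never visited, which is the kind of saving backward exploration aims for compared with working forward from every state.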
Similar resources
Fault tolerant supergraphs with automorphisms
Given a basic graph Y and a desired level of fault-tolerance k, an objective in fault-tolerant system design is to construct a supergraph X such that the removal of any k nodes from X leaves a graph containing Y . In order to reconfigure around faults when they occur, it is also required that any two subsets of k nodes of X are in the same orbit of the action of its automorphism group. In this ...
Policy Evaluation and Empirical Growth Research
This paper explores the implications of the vast body of studies of cross-country growth determinants for the evaluation of alternative policies. Empirical growth studies have experienced a remarkable flowering in the last fifteen years, and innumerable insights have unquestionably been uncovered concerning similarities and differences in the growth experiences of various groups of countries. T...
Goal directed policy conflict detection and prioritisation: an empirical evaluation
We address the problem of developing effective automated reasoning support for the detection and resolution of conflicts between plans and policies (or norms). How automated reasoning mechanisms can effectively support human decision makers in this process is little understood. In this research, we have conducted experiments with human subjects to assess how effective these reasoning mechanisms...
On Supergraphs Satisfying CMSO Properties
Let CMSO denote the counting monadic second order logic of graphs. We give a constructive proof that for some computable function $f$, there is an algorithm $A$ that takes as input a CMSO sentence $\varphi$, a positive integer $t$, and a connected graph $G$ of maximum degree at most $\Delta$, and determines, in time $f(|\varphi|, t) \cdot 2^{O(\Delta \cdot t)} \cdot |G|^{O(t)}$, whether $G$ has a supergraph $G'$ of treewidth at most $t$ such that $G' \models \varphi$...
Nowhere-zero k-flows of Supergraphs
Let $G$ be a 2-edge-connected graph with $o$ vertices of odd degree. It is well-known that one should (and can) add $o/2$ edges to $G$ in order to obtain a graph which admits a nowhere-zero 2-flow. We prove that one can add to $G$ a set of $\le \lfloor o/4 \rfloor$, $\lceil \frac{1}{2} \lfloor o/5 \rfloor \rceil$, and $\lceil \frac{1}{2} \lfloor o/7 \rfloor \rceil$ edges such that the resulting graph admits a nowhere-zero 3-flow, 4-flow, and 5-flow, respectively.
Journal
Journal title: IEEE Journal on Selected Areas in Information Theory
Year: 2021
ISSN: 2641-8770
DOI: https://doi.org/10.1109/jsait.2021.3073257